This document illustrates how we can improve the style and performance of a Shiny app. The assignment consists of developing a Shiny app that tracks encounters with animal and plant species around the world. The dataset comes from two large csv files (4G and 2.4G) which cannot be opened on an ordinary computer.
To demonstrate how we can improve reactivity and sharpen our Shiny skills, we will walk through the survey sections and try to cite concrete examples.
library(remotes)
library(tictoc)
library(RSQLite)
library(tidyverse)
library(geojson)
library(geojsonio)
library(data.table)
library(profvis)
library(DT)
library(promises)
library(future)

The Shiny app that I developed during the assignment, named biodiversity, is a full R package that meets CRAN/Bioconductor criteria. It uses a map with multiple layers. The species are marked with different colors on the map. The user can show/hide any kingdom and search for any species using keywords; the app returns the matched species and focuses on the selected one.
To install and run the biodiversity Shiny app, just run this code, or try the demo.
Screenshot of biodiversity
As we said, the dataset consists of two csv files totaling about 6.5G. The first option to deal with this is to convert the csv files to an SQLite database, for example.
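As a sketch of that conversion (the file and table names here are hypothetical, not the assignment's actual paths), the csv can be streamed into SQLite in chunks so it never has to fit in memory:

```r
library(DBI)
library(RSQLite)
library(readr)

con <- DBI::dbConnect(RSQLite::SQLite(), "biodiversity.db")

# Stream the large csv into SQLite chunk by chunk; only one chunk
# is ever held in memory. File and table names are hypothetical.
readr::read_csv_chunked(
  "occurrence.csv",
  callback = function(chunk, pos) {
    DBI::dbWriteTable(con, "occurrence", chunk, append = TRUE)
  },
  chunk_size = 100000
)

DBI::dbDisconnect(con)
```

After the conversion, `dplyr::tbl(con, "occurrence")` can query the table lazily instead of loading the whole file.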
In reality, using a database instead of csv files is not enough to scale the app to thousands of users. It is also important to use faster functions, as shown here:
apply family versus looping

con <- DBI::dbConnect(RSQLite::SQLite(), "../../../DATA/Concours/Appsilon/biodiversity/biodiversity/inst/biodiversity/extdata/biodiversity.db")
countries_list <- NULL
tic(msg = "### LOOPING PROCESS ###")
for (i in DBI::dbListTables(con)){
countries_list[[i]] <- tbl(con,i) %>% as_tibble()
}
toc()## ### LOOPING PROCESS ###: 4.341 sec elapsed
countries_list <- NULL
tic(msg = "### LAPPLY PROCESS ###")
countries_list <- lapply(DBI::dbListTables(con), function(x) tbl(con,x) %>% as_tibble)
toc()## ### LAPPLY PROCESS ###: 3.641 sec elapsed
In general it is better to use apply family functions instead of loops.
In our case, we need to load a geo.json map file. Mainly, this kind of file carries metadata that can make it heavier. Here, we compare two files with different metadata content. The first one is heavy and located in the /extdata folder; the other is loaded directly from the source URL.
## Load Map SLOW
tic(msg = "### From File ###")
countries_map <- geojson_read("../../../DATA/Concours/Appsilon/countries.geojson", what = "sp")
toc()## ### From File ###: 4.16 sec elapsed
## Load map FASTER
tic(msg = "### From link ###")
countries_map <- geojson_read("https://raw.githubusercontent.com/johan/world.geo.json/master/countries.geo.json", what = "sp")
toc()## ### From link ###: 1.877 sec elapsed
Nice! In the next steps we will improve the map loading by caching.
We can compare multiple ways to filter our data. Here we try five approaches (grepl, %in%, str_detect, data.table, and data.table with grepl) and will select the fastest one.
set.seed(34)
countries <- c("Poland", "Switzerland")
tic("### USING grepl ###")
biodiversity_data<- countries_list %>%
rbindlist() %>%
filter(grepl(paste0(countries, collapse = "|"), country, ignore.case = TRUE))
toc()## ### USING grepl ###: 1.951 sec elapsed
tic("### USING %in% ###")
biodiversity_data <- countries_list %>%
rbindlist() %>%
filter( country %in% countries )
toc()## ### USING %in% ###: 0.722 sec elapsed
tic("### USING str_detect ###")
biodiversity_data <- countries_list %>%
rbindlist() %>%
filter( str_detect(country, countries))
toc()## ### USING str_detect ###: 0.777 sec elapsed
tic("### USING data.table ###")
biodiversity_data <- countries_list %>%
rbindlist() %>% as.data.table()
biodiversity_data <- biodiversity_data[country %in% countries] # note: '==' against the collapsed pattern "Poland|Switzerland" would match nothing
toc()## ### USING data.table ###: 1.209 sec elapsed
tic("### USING DT ###")
biodiversity_data <- countries_list %>%
rbindlist() %>% as.data.table()
biodiversity_data <- biodiversity_data[grepl(paste0(countries, collapse ="|"), country)]
toc()## ### USING DT ###: 1.6 sec elapsed
While adding countries to the database, the app becomes slow, mainly when plotting the map. The bottleneck grows as more and more circle markers (encounters) are plotted on the map.
Note
In the case of the biodiversity package, loading the inputs (map and tables), the processing, and the building of the map with circles do not seem to be the bottleneck. It should be the rendering process on screen. We will proceed to analyse this hypothesis, using the profvis package to see where the code consumes a lot of memory.
## Loading required package: shiny
##
## Attaching package: 'shiny'
## The following objects are masked from 'package:DT':
##
## dataTableOutput, renderDataTable
## The following object is masked from 'package:geojsonio':
##
## validate
## sourcing frontPage: 0.78 sec elapsed
## sourcing frontPage_ui: 0.004 sec elapsed
## [1] "NEW QUERY OF: Poland"
## Loading Map for Poland: 0.622 sec elapsed
## Loading Table of Poland: 0.218 sec elapsed
## data Processing of Poland: 0.67 sec elapsed
## Building the Map of Poland: 0.909 sec elapsed
## [1] "The biodiversity App is closed."
The blue color in the profiling indicates that output$worldMap is the heaviest computation.
We will analyze it thoroughly using tictoc.
The user can select a country to focus on and search for species. When the user wants to change the country, a popup appears and waits for the country input. Here we can cache the previous plot: if the user reselects the same country (caching can be at session or application level), the app can return it without recomputing.
bindCache() of renderLeaflet()

We just added %>% bindCache(vals$countries, cache = "session") to the end of renderLeaflet().
vals <- reactiveValues(countries = NULL)
## Listening OK button
observeEvent(input$ok, {
if (!is.null(input$countries_id) && nzchar(input$countries_id)) {
vals$countries <- input$countries_id
removeModal()
} else {
showModal(popupModal(failed = TRUE))
}
})
output$worldMap <- renderLeaflet({
...
for (i in input$countries_id){
#Put each table in the list, one by one
table_list[[i]] <- tbl(con,i) %>% as_tibble()
...
}
}) %>% bindCache(vals$countries, cache = "session")

Note
By adding %>% bindCache(vals$countries, cache = "session") to renderLeaflet(), we cache all of its processing for a specific vals$countries. If the user selects a new vals$countries, the computation is performed; if not, the app returns the cached object. Concretely, we no longer see the progress bars for loading the data, preprocessing, and plotting the map.
We used caching at the session level. We can generalize the cache to the application level and memorize the object across multiple users by setting %>% bindCache(vals$countries, cache = "app").
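By default the application-level cache lives in memory; as a sketch (the directory name and size are assumptions), it can be swapped for a disk cache with cachem before the server starts, so cached renders survive restarts and are shared across workers:

```r
library(shiny)
library(cachem)

# Replace the default in-memory app cache with a 500 MB disk cache.
# The directory "./app_cache" is an assumption for illustration.
shinyOptions(cache = cachem::cache_disk("./app_cache", max_size = 500 * 1024^2))
```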
## sourcing frontPage: 0.066 sec elapsed
## sourcing frontPage_ui: 0.025 sec elapsed
## [1] "NEW QUERY OF: Poland"
## Loading Map for Poland: 0.325 sec elapsed
## Loading Table of Poland: 0.192 sec elapsed
## data Processing of Poland: 0.669 sec elapsed
## Building the Map of Poland: 0.772 sec elapsed
## [1] "NEW QUERY OF: Poland"
## [1] "NEW QUERY OF: Switzerland"
## Loading Map for Switzerland: 0.343 sec elapsed
## Loading Table of Switzerland: 0.558 sec elapsed
## data Processing of Switzerland: 1.709 sec elapsed
## Building the Map of Switzerland: 1.886 sec elapsed
## [1] "NEW QUERY OF: Switzerland"
## [1] "NEW QUERY OF: Poland"
## [1] "The biodiversity App is closed."
Steps of this demo: Query Poland, Poland, Switzerland, Switzerland, Poland.
Only the first query on each country has computing steps.
You can see that Loading Map for … is performed for both Poland and Switzerland, which is not necessary. In the next step, we will cache it with memoise.
Memoise functions

The same result can be obtained using the memoise function. Here we will define a BAD function inside renderLeaflet just to see how it works.
output$worldMap <- renderLeaflet({
leaflet_fun <- function(countries_id){
...
for (i in input$countries_id){
#Put each table in the list, one by one
table_list[[i]] <- tbl(con,i) %>% as_tibble()
...
}
}
## memorize at the session level
m_leaflet_fun <- memoise::memoise(leaflet_fun, cache = session$cache)
})
m_leaflet_fun(input$countries_id)

Note
The argument of the memoised function is the input country (the selected country). In this case, all processing runs only when a new country is selected by the user. On the other hand, as in the previous bindCache() example, each country is processed only the first time.
## sourcing frontPage: 0.07 sec elapsed
## sourcing frontPage_ui: 0.003 sec elapsed
## [1] "NEW QUERY OF: Poland"
## Loading Map for Poland: 0.409 sec elapsed
## Loading Table of Poland: 0.218 sec elapsed
## data Processing of Poland: 0.754 sec elapsed
## Building the Map of Poland: 0.795 sec elapsed
## [1] "NEW QUERY OF: Poland"
## [1] "NEW QUERY OF: Switzerland"
## Loading Map for Switzerland: 0 sec elapsed
## Loading Table of Switzerland: 0.466 sec elapsed
## data Processing of Switzerland: 1.531 sec elapsed
## Building the Map of Switzerland: 1.904 sec elapsed
## [1] "NEW QUERY OF: Switzerland"
## [1] "NEW QUERY OF: Poland"
## [1] "The biodiversity App is closed."
Steps: Query Poland, Poland, Switzerland, Switzerland, Poland.
Only the first query on each country has computing steps.
To memoise the process at the application level we need to change the cache argument: m_leaflet_fun <- memoise::memoise(leaflet_fun, cache = getShinyOption("cache"))
Reading the geojson map takes a while, and it is reloaded for each country, which is not necessary. To memoise the map we can do this:
geojson_read_fun <- function(url){
#withProgress(message = 'Loading Map ...', value = 20, {
## Load Map source : https://datahub.io/core/geo-countries#r
#countries_map <- geojson_read("extdata/countries.geojson", what = "sp")
## Load map Faster
geojson_read(url, what = "sp")
#})
}
m_geojson_read_fun <- memoise::memoise(geojson_read_fun)

tic(paste0("Loading Map for ", "Poland"))
countries_map <- m_geojson_read_fun("https://raw.githubusercontent.com/johan/world.geo.json/master/countries.geo.json")
toc()## Loading Map for Poland: 0.349 sec elapsed
Note
Nice! As you can see in the last run of biodiversityMemoise::biodiversityMemoise(), Loading Map… consumed about 4.6 sec. The following queries have an elapsed time of 0 s.
Actually, I did not find a way to use promises and future to improve loading the data or renderLeaflet, since integrating promises into Shiny is generally done within outputs, reactive expressions, and observers.
In our case, renderLeaflet is an exception among Shiny outputs: it expects a value like renderText() or renderPlot(), but it needs two data types, one to plot the map (geo.json) and one to enrich it with biodiversity data.
We can launch the loading of the data or the map with future, but we cannot use the results directly as promise objects. For example:
tic("NORMAL Loading the Map")
#countries_map <- geojson_read("https://raw.githubusercontent.com/johan/world.geo.json/master/countries.geo.json",what = "sp")
countries_map <- geojson_read("../../../DATA/Concours/Appsilon/countries.geojson", what = "sp")
toc()## NORMAL Loading the Map: 3.028 sec elapsed
tic("FUTURE Loading the Map " )
countries_map <- future({
#countries_map <- geojson_read("https://raw.githubusercontent.com/johan/world.geo.json/master/countries.geo.json",what = "sp")
geojson_read("../../../DATA/Concours/Appsilon/countries.geojson", what = "sp")
})
toc()## FUTURE Loading the Map : 4.41 sec elapsed
## convert Map FUTURE using value(): 0.006 sec elapsed
tic("NORMAL Load data")
biodiversity_data <- read_rds("../../../DATA/Concours/Appsilon/biodiversity-data/full_data_Poland_Switzerland_Germany_France_Spain_USA.rds")
toc()## NORMAL Load data: 9.248 sec elapsed
tic("FUTURE Load data")
biodiversity_data_fu <-future({
read_rds("../../../DATA/Concours/Appsilon/biodiversity-data/full_data_Poland_Switzerland_Germany_France_Spain_USA.rds")
})
toc()## FUTURE Load data: 6.112 sec elapsed
tic("Converting data FUTURE using Value()")
biodiversity_data <- value(biodiversity_data_fu)
print(paste0("Table dimension: ", dim(biodiversity_data)))## [1] "Table dimension: 404411" "Table dimension: 15"
## Converting data FUTURE using Value(): 0.003 sec elapsed
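For completeness, the usual pattern for consuming a future inside a Shiny output is the promise pipe. This is only a sketch (the output id and file path are hypothetical), not code from the biodiversity app:

```r
library(shiny)
library(readr)
library(promises)
library(future)
plan(multisession)

server <- function(input, output, session) {
  output$row_count <- renderText({
    # The slow read runs in a background R process ...
    future_promise({
      read_rds("big_table.rds")  # hypothetical path
    }) %...>%
      # ... and this step runs back in the main process once it resolves
      (function(df) paste("Rows loaded:", nrow(df)))
  })
}
```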
Here is an example of how we can play with css and js files to improve the look of a Shiny app. We can also add attractive documentation using Markdown or R Markdown.
shinyUI(fluidPage(theme = shinytheme("flatly"), title = "Biodiversity", #superhero, flatly
# Add CSS files
includeCSS(path = "www/AdminLTE.css"),
includeCSS(path = "www/shinydashboard.css"),
tags$head(includeCSS("www/styles.css")),
## Include Appsilon logo at the right of the navbarPage
tags$head(tags$script(type="text/javascript", src = "logo.js" )),
## Include Biodiversity logo
navbarPage(title=div(img(src="biodiversity.png", height = "50px", width = "50px",
style = "position: relative; top: -14px; right: 1px;"),
"Biodiversity"),
tabPanel("Globe",icon = icon('globe'),
div(class="outer",
tags$head(includeCSS("www/styles.css")),
uiOutput('ui_frontPage')
)),
navbarMenu("", icon = icon("question-circle"),
tabPanel("About",icon = icon("info"),
withMathJax(includeMarkdown("extdata/help/about.md"))
),
tabPanel("Performance",icon = icon("creative-commons-sampling"),
withMathJax(includeMarkdown("extdata/help/performance.md"))
),
tabPanel("Help", icon = icon("question"),
withMathJax(includeMarkdown("extdata/help/help.md"))),
tabPanel(tags$a(
"", href = "https://github.com/kmezhoud/biodiversity/issues", target = "_blank",
list(icon("github"), "Report issue")
)),
tabPanel(tags$a(
"", href = "https://github.com/kmezhoud/biodiversity", target = "_blank",
list(icon("globe"), "Resources")
))
)
)
))

We can add a transparent layer with a button, a collapsible table, and plotly, as in this piece of code that generates a transparent panel with a button over the map:
column(width = 12,#style='height:200px',
div(class="outer",
tags$head(includeCSS("www/styles.css")),
leafletOutput("worldMap", height = "600px"),
absolutePanel(id = "panel_id", class = "panel panel-default",
top = 300, left = 20, width = 45, fixed=FALSE,
draggable = TRUE, height = 45 ,
# Make the absolutePanel collapsible
#HTML('<button data-toggle="collapse" data-target="#popup_id">Country</button> '),
#tags$div(id = 'popup_id', class="collapse",#style='background-color:transparent; border-color: transparent',
div(actionButton(inputId = "popup_id",label = "",
icon = icon("globe"),
style='background-color:transparent; border-color: transparent'),
style = "font-size:100%")
#)
),
)
)

I can cite a wrapper of the zxing JS library for a Shiny app. The goal is to read barcodes using a smartphone/laptop camera link.
The demo shows a complex DT table with multiple gadgets:
1- a selectInput in each row
2- a button in each row that launches the barcode reader using the camera.
The user can scroll through serial numbers or scan a barcode to match the serial number. Each button recognizes its row_id.
Below the camera frame, the gadget returns the serial number of the scanned barcode. I added a beep sound to inform the user that the scan succeeded. The returned serial number goes automatically to the corresponding cell in the DT.
Mainly, the Shiny apps that I develop are full R packages with:
1- build, test, and check processes during development
2- function documentation
3- a vignette for users
If the task is part of a big project hosted on GitHub:
1- Clone the github:branch
2- Make the changes and features for the solution
3- Build, test, check
4- Open a pull request
I draw on my background and on examples that I developed in previous years.
I can also use helpful packages like golem, shinydashboard, shiny.semantic, and shinyMobile to accelerate prototyping.
A- Screenshots of 6 Shiny apps: Shiny Apps Screenshot.pdf
B- Examples of EDA:
2- Restaurant reservations and customer behaviours
I used the Spyder IDE to develop web apps with Python. I tried Django and Flask, but I am more comfortable with RStudio and Shiny.
2 - Survival Lung Cancer Event prediction
Local infrastructure
The infrastructure of a Shiny app is like that of an R package. For reference, please see the infrastructure of the Shiny app package developed during the assignment step (private, but appsilon-hiring is a collaborator).
Deployment purpose:
1- When using Shiny Server (on a local or cloud Ubuntu server), the app can be installed as a package or deployed at /srv/shiny-server/app_name.
2- As a docker image: docker run -d -p 3838:3838 kmezhoud/biodiversity:0.1
3- As a Kubernetes image
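For the Docker option, a minimal Dockerfile could look like the sketch below. The base image is the public rocker/shiny; the system libraries, package list, and copied path are assumptions for illustration, not the exact recipe behind kmezhoud/biodiversity:0.1.

```dockerfile
FROM rocker/shiny:latest

# System libraries often required by the geospatial R stack (assumed set)
RUN apt-get update && \
    apt-get install -y libgdal-dev libudunits2-dev && \
    rm -rf /var/lib/apt/lists/*

# Install the app's R dependencies (list assumed for illustration)
RUN R -e "install.packages(c('leaflet', 'geojsonio', 'DT', 'tidyverse'))"

# Deploy the app under the standard shiny-server location
COPY . /srv/shiny-server/biodiversity

EXPOSE 3838
CMD ["/usr/bin/shiny-server"]
```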
Deployment setting:
1- nginx or traefik
2- ports and redirection
3- firewall, iptables, and email alerts
4- reverse proxy and SSL domain certificate
5- keyring for sensitive details such as login credentials, IP addresses, and the table and column names of the database.
I can cite an example of a Shiny mobile app named mobi100c. It was developed to ease the ordering process between customers, retailers, and stores, with a central Azure database and different login profiles. (login: WASSIM, pwd: 1)
I am familiar with RStudio and RStudio Server. I can install and use RStudio Server on DigitalOcean.
Something like DigitalOcean, I suppose.